Efficient Optimal Transport Algorithm by Accelerated Gradient Descent

Authors

Abstract

Optimal transport (OT) plays an essential role in various areas like machine learning and deep learning. However, computing the discrete optimal transport plan for large-scale problems with adequate accuracy and efficiency is still highly challenging. Recently, methods based on the Sinkhorn algorithm add an entropy regularizer to the prime problem and obtain a trade-off between efficiency and accuracy. In this paper, we propose a novel algorithm to further improve the efficiency and accuracy based on Nesterov's smoothing technique. Basically, the non-smooth c-transform of the Kantorovich potential is approximated by the smooth Log-Sum-Exp function, which finally smooths the original non-smooth dual functional (energy). The smoothed energy can be optimized by the fast proximal gradient algorithm (FISTA) efficiently. Theoretically, the computational complexity of the proposed method is given by $O(n^{\frac{5}{2}} \sqrt{\log n} /\epsilon)$, which is lower than that of the Sinkhorn algorithm. Empirically, compared with the Sinkhorn algorithm, our experimental results demonstrate that the proposed method achieves faster convergence and better accuracy with the same parameter.
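The smoothing idea in the abstract can be illustrated with a short sketch. The code below is not the authors' implementation; it is a minimal NumPy illustration assuming a cost matrix C, marginals mu and nu, a user-chosen smoothing parameter gamma, and a plain FISTA-style accelerated ascent on the smoothed dual. Function names such as smoothed_dual and fista_ot are illustrative.

    import numpy as np

    def smoothed_dual(phi, C, mu, nu, gamma):
        # Smoothed Kantorovich dual energy: the non-smooth c-transform
        # phi^c_i = min_j (C_ij - phi_j) is replaced by its Log-Sum-Exp
        # (soft-min) approximation with smoothing parameter gamma.
        z = (phi[None, :] - C) / gamma
        z_max = z.max(axis=1, keepdims=True)        # for numerical stability
        phi_c = -gamma * (z_max[:, 0] + np.log(np.exp(z - z_max).sum(axis=1)))
        return nu @ phi + mu @ phi_c

    def grad_smoothed_dual(phi, C, mu, nu, gamma):
        # Gradient of the smoothed dual w.r.t. phi: nu minus the mu-weighted
        # row-wise softmax of (phi_j - C_ij) / gamma.
        z = (phi[None, :] - C) / gamma
        z -= z.max(axis=1, keepdims=True)
        P = np.exp(z)
        P /= P.sum(axis=1, keepdims=True)
        return nu - mu @ P

    def fista_ot(C, mu, nu, gamma=0.01, iters=500):
        # Accelerated (FISTA-style) ascent on the smoothed dual;
        # 1/gamma is used as a rough Lipschitz constant of the gradient.
        L = 1.0 / gamma
        phi = np.zeros(C.shape[1])
        y, t = phi.copy(), 1.0
        for _ in range(iters):
            phi_next = y + grad_smoothed_dual(y, C, mu, nu, gamma) / L
            t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
            y = phi_next + (t - 1.0) / t_next * (phi_next - phi)
            phi, t = phi_next, t_next
        return phi

Once phi has converged, the same row-wise softmax weights, scaled by mu, give one way to read off an approximate transport plan.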


Similar Articles

Computational Optimal Transport: Complexity by Accelerated Gradient Descent Is Better Than by Sinkhorn's Algorithm

We analyze two algorithms for approximating the general optimal transport (OT) distance between two discrete distributions of size n, up to accuracy ε. For the first algorithm, which is based on the celebrated Sinkhorn's algorithm, we prove the complexity bound $\widetilde{O}(n^2/\varepsilon^2)$ arithmetic operations. For the second one, which is based on our novel Adaptive Primal-Dual Accelerated Gradient Descent (A...
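For reference, the Sinkhorn baseline that such complexity bounds are measured against can be sketched in a few lines. This is a generic textbook version, not the code analyzed in that paper; the regularization parameter eps and the iteration count are placeholders.

    import numpy as np

    def sinkhorn(C, mu, nu, eps=0.01, iters=1000):
        # Entropy-regularized OT: alternately rescale the Gibbs kernel
        # K = exp(-C / eps) so that the plan matches the marginals mu and nu.
        K = np.exp(-C / eps)
        u = np.ones_like(mu)
        v = np.ones_like(nu)
        for _ in range(iters):
            u = mu / (K @ v)          # fix row marginals
            v = nu / (K.T @ u)        # fix column marginals
        return u[:, None] * K * v[None, :]   # plan P = diag(u) K diag(v)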

Accelerated Gradient Descent by Factor-Centering Decomposition

Gradient factor centering is a new methodology for decomposing neural networks into biased and centered subnets which are then trained in parallel. The decomposition can be applied to any pattern-dependent factor in the network’s gradient, and is designed such that the subnets are more amenable to optimization by gradient descent than the original network: biased subnets because of their simpli...

Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent

Nesterov's accelerated gradient descent (AGD), an instance of the general family of "momentum methods", provably achieves a faster convergence rate than gradient descent (GD) in the convex setting. However, whether these methods are superior to GD in the nonconvex setting remains open. This paper studies a simple variant of AGD, and shows that it escapes saddle points and finds a second-order stat...
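The momentum update this snippet refers to can be written compactly. The sketch below is the standard look-ahead form of Nesterov's method applied to a user-supplied gradient, not the perturbed variant studied in that paper; the step size and momentum values are placeholders.

    import numpy as np

    def nesterov_agd(grad, x0, lr=0.01, momentum=0.9, iters=100):
        # Standard Nesterov accelerated gradient descent: evaluate the
        # gradient at the extrapolated (look-ahead) point, then update.
        x = np.asarray(x0, dtype=float)
        v = np.zeros_like(x)
        for _ in range(iters):
            lookahead = x + momentum * v
            v = momentum * v - lr * grad(lookahead)
            x = x + v
        return x

    # Example: minimizing f(x) = x^2 drives the iterate toward 0.
    x_star = nesterov_agd(lambda x: 2.0 * x, x0=[5.0])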

Asynchronous Accelerated Stochastic Gradient Descent

Stochastic gradient descent (SGD) is a widely used optimization algorithm in machine learning. In order to accelerate the convergence of SGD, a few advanced techniques have been developed in recent years, including variance reduction, stochastic coordinate sampling, and Nesterov’s acceleration method. Furthermore, in order to improve the training speed and/or leverage larger-scale training data...

Conditional Accelerated Lazy Stochastic Gradient Descent

In this work we introduce a conditional accelerated lazy stochastic gradient descent algorithm with an optimal number of calls to a stochastic first-order oracle and convergence rate $O(1/\varepsilon^2)$, improving over the projection-free, Online Frank-Wolfe based stochastic gradient descent of Hazan and Kale [2012] with convergence rate $O(1/\varepsilon^4)$.


Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2022

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v36i9.21251